Using Word Lattice Information for a Tighter Coupling in Speech Translation Systems

Authors

  • Tanja Schultz
  • Szu-Chen Stan Jou
  • Stephan Vogel
  • Shirin Saleem
Abstract

In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) to improve the overall performance of our speech translation system. In conventional speech translation systems, the recognizer outputs a single hypothesis, which is then translated by the SMT system. This approach is limited because it depends largely on the word error rate of the first-best hypothesis. The word error rate is typically lowered by generating many alternative hypotheses in the form of a word lattice. The translation system can use the information in the word lattice, together with the recognizer's scores, to obtain better performance. In our experiments, switching from single-best hypotheses to word lattices as the interface between ASR and SMT, and introducing weighted acoustic scores into the translation system, improved overall performance by 16.22%.
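The idea of passing a lattice rather than a single hypothesis, and of weighting the recognizer's acoustic scores inside the translation step, can be illustrated with a small sketch. This is not the authors' system: the toy lattice, the function names `enumerate_paths` and `best_path`, the stand-in `toy_translation_score`, and the log-linear combination with `acoustic_weight` are all assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Toy word lattice (assumed structure): node -> list of
# (next_node, word, acoustic_log_score) edges.
Lattice = Dict[int, List[Tuple[int, str, float]]]


def enumerate_paths(
    lattice: Lattice, start: int, end: int
) -> Iterator[Tuple[List[str], float]]:
    """Yield (words, summed acoustic log score) for every path start -> end."""
    stack: List[Tuple[int, List[str], float]] = [(start, [], 0.0)]
    while stack:
        node, words, score = stack.pop()
        if node == end:
            yield words, score
            continue
        for nxt, word, acoustic in lattice.get(node, []):
            stack.append((nxt, words + [word], score + acoustic))


def best_path(
    lattice: Lattice,
    start: int,
    end: int,
    translation_score: Callable[[List[str]], float],
    acoustic_weight: float,
) -> Tuple[List[str], float]:
    """Pick the path maximizing translation score + weight * acoustic score."""
    return max(
        enumerate_paths(lattice, start, end),
        key=lambda p: translation_score(p[0]) + acoustic_weight * p[1],
    )


# Hypothetical data: two competing words at each position.
lattice: Lattice = {
    0: [(1, "recognize", -1.0), (1, "wreck", -0.8)],
    1: [(2, "speech", -0.5), (2, "beach", -0.4)],
}


def toy_translation_score(words: List[str]) -> float:
    """Stand-in for an SMT model score: penalize words it handles poorly."""
    well_covered = {"recognize", "speech"}
    return sum(0.0 if w in well_covered else -1.0 for w in words)


# First-best by acoustic score alone (the conventional interface).
first_best = max(enumerate_paths(lattice, 0, 2), key=lambda p: p[1])[0]

# Lattice-based choice with a weighted acoustic score.
coupled = best_path(lattice, 0, 2, toy_translation_score, acoustic_weight=0.5)[0]
```

In this toy setting the acoustically best path differs from the path selected once the translation score is taken into account, which is the situation a lattice interface is meant to exploit. A real system would search the lattice directly rather than enumerate every path, and would tune the weight on held-out data.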


Similar resources

Using Word Lattice Information for a Tighter Coupling in Speech Translation Systems

In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) to improve the overall performance of our speech translation system. In conventional speech translation systems, the recognizer outputs a single hypothesis, which is then translated by the SMT system. This approach has the limitation of being l...


th Jeju Island, Korea, October 4-8, 2004

In this paper we present first experiments towards a tighter coupling between Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT) to improve the overall performance of our speech translation system. In conventional speech translation systems, the recognizer outputs a single hypothesis, which is then translated by the SMT system. This approach has the limitation of being l...


Combining natural language processing systems to improve machine translation of speech

Machine translation of spoken language is a challenging task that involves several natural language processing (NLP) software modules. Human speech in one natural language has to be first automatically transcribed by a speech recognition system. Next, the transcription of the spoken utterance can be translated into another natural language by a machine translation system. In addition, it may be...


A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation

Speech translation is conventionally carried out by cascading an automatic speech recognition (ASR) and a statistical machine translation (SMT) system. The hypotheses chosen for translation are based on the ASR system’s acoustic and language model scores, and typically optimized for word error rate, ignoring the intended downstream use: automatic translation. In this paper, we present a coarset...


Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is used mainly for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...




Publication date: 2004